Describe the morphology and mitochondrial genome of Mecidea indica Dallas, 1851 (Hemiptera, Pentatomidae), with its phylogenetic position

We here describe the external morphology and complete mitochondrial genome characteristics of Mecidea indica Dallas, 1851, and clarify the evolutionary rate and divergence time. The M. indica mitochondrial genome is 15,670 bp long and exhibits the typical high A+T content (76.31%). The gene order is identical to the ancestral arrangement represented by Drosophila yakuba Burla, 1954, with no rearrangement. The M. indica mitochondrial genome characteristics were analyzed, and phylogenetic trees of Pentatomidae were reconstructed using Bayesian methods based on different mitochondrial genome datasets. Phylogenetic analysis shows that M. indica belongs to Pentatominae and forms a sister group with Anaxilaus musgravei Gross, 1976, and that Asopinae is highly supported as monophyletic. Molecular clock analysis estimates a divergence time of Pentatomidae of 122.75 Mya (95% HPD: 98.76–145.43 Mya), within the Cretaceous period of the Mesozoic era; M. indica and A. musgravei diverged no later than 50.50 Mya (95% HPD: 37.20–64.80 Mya). In addition, the divergence time of Asopinae was 62.32 Mya (95% HPD: 47.08–78.23 Mya), in the Paleogene period of the Cenozoic era. This study contributes to reconstructing the phylogeny of Pentatomidae and provides insights into its evolutionary history.


Introduction
Pentatomidae is the largest family in the superfamily Pentatomoidea and is distributed worldwide. Currently, approximately 5000 species in more than 900 genera have been recorded [1,2]. Most species of Pentatomidae are herbivorous, and many are considered primary crop pests worldwide, causing huge losses every year [2]. Phytophagous species use their piercing-sucking mouthparts to feed on the sap of the vegetative organs of host plants, causing the plants to wither and/or die, which makes them important agricultural and forestry pests [3]. For example, Nezara viridula (Linnaeus, 1758) damages rice; Halyomorpha halys (Stål, 1855) damages apples, pears, and other fruit trees; and species of the genus Eurydema Laporte, 1833 damage cruciferous vegetables. However, most species of Asopinae (Heteroptera: Pentatomidae) are predatory stink bugs that feed on the larvae of Lepidoptera and Coleoptera and can be used for biological control [4][5][6].
The genus Mecidea (Hemiptera: Pentatomidae) comprises a group of stink bugs that occur in subtropical and adjacent temperate parts of the world. Within these regions, the distribution of the genus appears to coincide closely with that of xerophytic or semi-xerophytic environments [7]. The genus was established by Dallas in 1851 for two species, indica (Bengal) and linearis; Mecidea indica is thus one of its original members. Sailer reviewed the genus in 1952, including M. indica, and provided species literature, identification keys, descriptions, and figures [7]. Hsiao et al. (1977) [8] recorded this species in China and provided habitus photographs and brief descriptions. Rider and Zheng (2002) [9] updated the distribution of this species in China. Rider (2006) [10] provided the most recent worldwide distribution information on this species. The latest literature on M. indica was provided by Fan (2011) [11], whose description lacked genitalia information.
A typical insect mitochondrial genome is a double-stranded, covalently closed circular DNA molecule containing 37 genes (13 protein-coding genes (PCGs), 22 transfer RNA genes (tRNAs), and two ribosomal RNA genes (rRNAs)) and a control region [12,13]. Mitochondrial genomes are widely used in studies of molecular evolution, population genetic structure, biogeography, and phylogeny because of their small size, stable genetic composition, relatively conserved gene order, and complete molecular information [14][15][16][17].
The current classification of the tribes and subfamilies of Pentatomidae is based on traditional taxonomic studies. Rider et al. (2018) [2] described each tribe and subfamily of Pentatomidae based on morphology, providing a good framework for phylogenetic analysis. In recent years, increasing amounts of molecular data on pentatomid species have become available, but most studies to date have focused on higher-level relationships, such as those within Pentatomoidea or Pentatomomorpha. For example, Yuan et al. (2015) [18] constructed a phylogenetic tree based on a 13-PCG dataset, which strongly supported the monophyly of Pentatomoidea. Mu et al. (2022) [19] supported this result. Xu et al. (2021) [20] constructed a phylogenetic tree based on PCGRNA and PCG12RNA datasets from 55 species of Pentatomoidea and found that site-heterogeneous mixture models provide more stable phylogenetic relationships. Grazia et al. (2008) [21] supported the monophyly of Pentatomidae based on morphological and molecular characters, and Zhao (2017) [22] supported this result. In a recent study, Genevcius et al. (2021) [23] used 69 morphological characters and five DNA loci to study the phylogeny of Pentatomidae and reported that most of its subfamilies and tribes were not monophyletic. Roca-Cusachs et al. (2022) [24] also rejected the currently accepted monophyly of Pentatomidae. Owing to a lack of robust phylogenetic methods and incomplete sampling, the internal relationships of Pentatomidae remain largely unknown.
We used phylogenetic and molecular clock analyses to explain the origin and evolution of Pentatomidae. Previously, Li et al. (2017) [25] analyzed phylogeny, reconstructed ancestral character states, and estimated divergence times, indicating that insect diversity may be largely due to coevolution with angiosperms and that key adaptive innovations (such as prognathous mouthparts and predatory behavior) facilitated multiple independent shifts among diverse feeding habits. That study provides a good reference for determining the origin of Pentatomidae. However, no study has systematically evaluated the divergence time of Pentatomidae; it is therefore particularly important to study the evolution of Pentatomidae by combining fossil data with molecular characteristics.
In this study, we provide a description of the morphological characteristics of M. indica, publish a complete mitochondrial genome obtained by high-throughput sequencing, and describe our detailed analyses of mitochondrial genome characteristics. By analyzing codon preference, RNA secondary structure, and evolutionary rates among Pentatomidae species, we can clarify internal relationships among Pentatomidae. In addition, our results from constructing phylogenetic trees of Pentatomidae and evaluating divergence time will help in understanding Pentatomidae evolution.

Descriptions and measurements
Male genitalia were observed and illustrated after treatment with warm 5% NaOH solution for approximately 20 min. Female genitalia were only illustrated externally. Photographs of both dorsal and ventral habitus were taken using a Nikon SMZ1000 microscope equipped with a computer-controlled SPOT RT digital camera and Helicon software. The terminology used to describe the external genitalia follows that of Fan et al. (2011) [11]. All measurements are given in millimeters.
Body length was measured from the apex of the head to the tip of the membrane of the hemelytra. Head width was measured between the eyes, and head length was measured from the tip of the head to the midpoint of the anterior margin of the pronotum. Pronotum length was measured from the midpoint of the anterior margin to the midpoint of the posterior margin, and width was measured across the greatest width of the pronotum. Scutellum length was measured from the midpoint of the anterior margin of the scutellum to the apex, and width was measured across the basal angles.

Sample collection and DNA extraction
Adult M. indica specimens were collected from Xiaochantan Wharf (109°10′E, 19°43′N), Yangpu Port, Danzhou City, Hainan Province, China, on December 22, 2020. The species used in this study is not a protected animal, and its use meets animal ethics requirements; the work was carried out in an ethical, humane, and responsible manner. All specimens were immediately placed in absolute ethanol and stored in a freezer at -20°C until DNA extraction. Total DNA was extracted from thoracic tissue using a Genomic DNA Extraction Kit (Sangon Biotech, Shanghai, China).

Sequencing, assembly, annotation and sequence analyses
The fluorescent-dye Quant-iT PicoGreen dsDNA Assay Kit was used to quantify the total DNA. The total amount of DNA was 2.39 μg, and the concentration measured by fluorescence was 47.80 ng/μl. After quality inspection, the required genomic library was constructed using the standard Illumina TruSeq Nano DNA LT library preparation process (Illumina TruSeq DNA Sample Preparation Guide). The mitochondrial genome of M. indica was sequenced on an Illumina NovaSeq 6000 platform in paired-end 2 × 150 bp mode. Fastp v.0.23.1 [26] was used to filter the raw data to obtain high-quality clean data. Geneious v.11.0 [27] was used to assemble and annotate the sequences. The reference sequence (Plautia lushanica Yang, 1934, NC_058973) [20] for assembly and annotation was obtained from the NCBI databases. The PCGs were edited manually using the Open Reading Frame (ORF) Finder (http://www.ncbi.nlm.nih.gov/gorf/gorf.html) with the invertebrate mitochondrial code. The locations of the initiation and stop codons of each protein-coding gene were determined by comparison with homologous genes from other insects. MITOS Web (http://mitos.bioinf.uni-leipzig.de/) [28] was used to predict the locations and secondary structures of the 22 tRNAs. The boundaries of the two rRNA genes were determined by comparison with those of previously reported mitogenomes. The location of the control region was identified from the boundaries of the neighboring genes.
A circular map of the M. indica mitochondrial genome was produced using the CGView Server [29]. Codon usage and nucleotide composition of the PCGs were determined with MEGA v.11.0 [30], and the skew in nucleotide composition was calculated using the following formulas: AT-skew = (A − T) / (A + T); GC-skew = (G − C) / (G + C) [31]. CodonW v.1.4.2 [32] was used to calculate the effective number of codons (ENC) for the 13 PCGs of 50 Pentatomidae species. To study the pattern of evolutionary divergence among the mitochondrial genomes of Pentatomidae species, DnaSP v.6.12.03 [33] was used to count non-synonymous substitutions (Ka) and synonymous substitutions (Ks) in the 13 PCGs of Pentatominae and to calculate Ka/Ks values. In addition, MEGA v.11.0 was used to calculate the conserved sites of tRNA and rRNA genes, and tandem repeats within the control region were identified using the Tandem Repeats Finder server (http://tandem.bu.edu/trf/trf.html) [34].
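As a worked illustration of the two skew formulas, the following minimal Python sketch computes AT-skew and GC-skew from base percentages. The function name `skews` is hypothetical, and the input values are the whole-genome composition reported for M. indica in the Results (A 42.97%, T 33.35%, C 12.79%, G 10.89%). It only restates the arithmetic and is not the MEGA workflow used in the study.

```python
def skews(a, t, g, c):
    """Return (AT-skew, GC-skew) from base counts or percentages."""
    at_skew = (a - t) / (a + t)
    gc_skew = (g - c) / (g + c)
    return at_skew, gc_skew


# Whole-genome base composition of M. indica as reported in the Results (%).
at_skew, gc_skew = skews(a=42.97, t=33.35, g=10.89, c=12.79)
print(f"AT-skew = {at_skew:.3f}, GC-skew = {gc_skew:.3f}")
# prints: AT-skew = 0.126, GC-skew = -0.080
```

A positive AT-skew means that A is more frequent than T, and a negative GC-skew means that C is more frequent than G on the strand being described.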
The PCGs and RNA genes were extracted using Geneious v.11.0, and MEGA v.11.0 was used to align the sequences of the protein-coding and RNA genes. The aligned sequences of each species were concatenated using Sequence Matrix v.1.7.8 [35]. Gblocks [36] was used to delete ambiguous sites.
Before constructing a phylogenetic tree, base substitution saturation and sequence composition heterogeneity analyses were performed on both datasets. DAMBE v.7.0.35 [37] was used to calculate the base substitution saturation index (Iss). An Iss value lower than the critical value (Iss.c) indicates that the dataset is not saturated and can be used for phylogenetic analysis. Heterogeneity analysis was performed using AliGROOVE v.1.0.8 [38]. Datasets with less heterogeneity were considered suitable for phylogenetic analysis.

Divergence time estimate
A relaxed lognormal clock model in BEAST v.1.8.4 [42] was used to estimate Pentatomidae divergence time based on the PCGs dataset. We applied a GTR+I+G substitution model to the partitions and used the calibrated Yule model as the tree prior. Fossil calibration points for Pentatomidae and the genus Eurydema Laporte de Castelnau, 1833 [43][44][45] were used for calibration. The Markov chain was run twice for 5 × 10^8 generations, sampling every 1000 generations with a burn-in of 25%. Tracer v.1.7.2 [46] was used to confirm chain convergence; the effective sample size for most parameters was greater than 200. Sampled trees were summarized using Tree Annotator v.1.1.8.4, and the 95% highest posterior density (95% HPD) intervals were displayed in FigTree v.1.4.3 [47].
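To make the sampling arithmetic and the meaning of a 95% HPD interval concrete, the sketch below counts the samples retained per run under the settings stated above and computes a highest posterior density interval as the narrowest window of sorted samples. The function names `retained_samples` and `hpd_interval` and the toy ages are hypothetical. This is a generic illustration and does not reproduce how BEAST, Tracer, or TreeAnnotator compute their summaries internally.

```python
import math


def retained_samples(generations, sample_every, burn_in_fraction):
    """Number of posterior samples kept per run after discarding burn-in."""
    total = generations // sample_every
    return total - int(total * burn_in_fraction)


def hpd_interval(samples, mass=0.95):
    """Shortest interval that contains `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(mass * n))  # samples that must fall inside
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


kept = retained_samples(5 * 10**8, 1000, 0.25)
print(kept)  # 375000 samples retained per run

# Toy node-age samples (hypothetical, not from the study).
toy_ages = [0, 10, 11, 12, 13, 14, 50, 60, 70, 80]
print(hpd_interval(toy_ages, mass=0.5))  # (10, 14)
```

Unlike an equal-tailed interval, the HPD interval is the shortest range holding the requested posterior mass, so it can be asymmetric around the mean age.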

Redescription of Mecidea indica Dallas, 1851
The body is long and narrow; the dorsum is yellow-white or yellow-brown, mottled with irregular fine dark spots. The venter is yellow-white, with two black longitudinal bands on each lateral side. Light brown punctures are present on the head and thorax, and punctures are absent or shallow on the abdomen (Fig 1). The head is triangular and somewhat pointed anteriorly; the juga are longer than the tylus and convergent in front, with straight lateral margins. The eyes are large, prominent, orange, and globose, with ocelli located at the posterior margin. The antennae are five-segmented; the first segment is white-yellowish and does not extend beyond the end of the head; the second segment is extremely long and stout, about twice the length of the third segment, and has three edges, one of which is slightly flattened outward; the remaining segments are cylindrical. The anterior angle of the bucculae protrudes semi-circularly; the outer margin is relatively straight, and the posterior angle gradually disappears, not exceeding the posterior edge of the eye. The rostrum extends to between the mesocoxae and the metacoxae; its first segment does not extend past the bucculae; the second segment is longer than the two apical segments. The pronotum is more than three times as long as wide; its dorsal surface is comparatively flat and coarsely punctured, except for the callus. The humeral angles are round and slightly prominent; the anterior angles are short, pointed, and slightly protruding, with their apices flush with the outer margin of the compound eye. The anterior margin is concave and not wider than the distance between the eyes, and the posterior margin is straight. The anterior lateral margin is slightly concave and minutely serrated. The scutellum forms an extremely elongated triangle; its apical third is yellowish-white, and its apex extends more than half the length of the abdomen. Its lateral margin is narrow with thin edges. The corium is dark, with deep black punctures. The exocorium is usually paler than the corium, yellowish-white, and the membrane clearly extends beyond the end of the abdomen. A smooth, slightly raised median ridge runs longitudinally from the base of the tylus to the apex of the scutellum. The proepisternum is simple; the midline of the mesosternum is carinate; and the midline of the metasternum is broad and shallowly sulcate. The metathoracic scent gland ostiole extends nearly to the dorsoanterior angle of the pruinose area, its apex sharp. The femora are unarmed, the tibiae sulcate, and the tarsi 3-segmented, with segment one equal in length to segments two and three. The basal half of the claw is yellowish-white, and the apical half is brown.
The abdomen has very shallow punctures or none, and two black longitudinal bands are present on the lateral side. The base of sternite III lacks a tubercle. The connexiva are not exposed, and each segment has a black spot around the stoma.
Male genitalia. The pygophore is cup-like, wider than long, and densely covered with long hairs. The posterolateral angles are horn-like and black; the dorsoposterior rim is concave and sinuate; and the ventroposterior rim has a deep cup-like concavity in the middle, with a sharp angle (Fig 2A and 2B). The paramere is simple and unbranched, with an elongated black spot at the apex (Fig 2C). The aedeagus is simple, with one pair of basolateral conjunctival lobes whose apices are not bifurcate but slightly sclerotized; ventral and apical conjunctival lobes were not observed; the median penial plates are strongly sclerotized, united at the base, and distinctly concave apically; the vesica protrudes from the venter of the median penial plates (Fig 2D).
Female genitalia. The first gonocoxites are large and plate-like, with their inner margins arched and clearly separated. The eighth paratergites are long and oval, with long hairs at the apices. The ninth paratergites are also long and oval, with their apices much longer than those of the eighth (Fig 2E).

Mitochondrial genomic structure
The M. indica mitochondrial genome is a double-stranded circular DNA molecule with a length of 15,670 bp (GenBank accession no. OR654110), containing 37 genes (13 PCGs, 22 tRNA genes, two rRNA genes) and a control region (Fig 3). The arrangement of the 37 genes is consistent with that of the typical insect Drosophila yakuba Burla, 1954, with no gene rearrangement. Fourteen genes are encoded on the N-strand, and 23 genes are encoded on the J-strand (Table 2). The nucleotide composition of the M. indica mitochondrial genome is A (42.97%) > T (33.35%) > C (12.79%) > G (10.89%), with AT (76.31%) > GC (23.69%), showing a positive AT-skew and a negative GC-skew (Table 3). The M. indica mitochondrial genome contains 15 intergenic spacers and six gene overlap regions. The spacers are 1–24 bp in length, with a total length of 99 bp. The overlap regions are 1–8 bp in length, with a total length of 27 bp. The longest gene overlap is observed between trnW and trnC.

Protein coding genes
The nucleotide composition of the [...] We analyzed the relationship between the effective number of codons (ENC), the GC content of all codons, and the GC content of the first (GC1), second (GC2), and third (GC3) codon positions to further explore the codon usage patterns of Pentatomidae species. The results showed that ENC has a strong positive correlation with GC and GC3 (R² > 0.95) and a weaker positive correlation with GC1 and GC2 (R² < 0.75) (Fig 5).
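For readers who want to see how an R² value of this kind is obtained, the sketch below computes the squared Pearson correlation between two paired series, such as GC3 and ENC values per species. The function name `r_squared` is hypothetical and the numbers are placeholders for illustration only; they are not data from this study, and the original analysis may have used a different tool.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Placeholder values for illustration only (not data from the study).
gc3 = [10.0, 12.0, 14.0, 16.0, 18.0]
enc = [38.0, 40.5, 42.0, 44.5, 46.0]
print(f"R^2 = {r_squared(gc3, enc):.3f}")
```

An R² close to 1 indicates that ENC varies almost linearly with the GC measure across species, which is the pattern described for GC and GC3.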
We calculated the synonymous substitution rate (Ks) and non-synonymous substitution rate (Ka) of the PCGs of Pentatomidae. The evolutionary rates of the PCGs are in the order of [...] (Fig 6). The results showed that Ks > Ka and Ka/Ks < 1, indicating that the PCGs evolved under purifying selection.
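The interpretation of the Ka/Ks ratio can be summarized in a few lines. The sketch below uses the conventional thresholds (ratio below 1 for purifying selection, equal to 1 for neutral evolution, above 1 for positive selection); the function name `selection_regime` and the example rates are hypothetical and are not values from this study.

```python
def selection_regime(ka, ks):
    """Return (Ka/Ks, label) using the conventional interpretation."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    omega = ka / ks
    if omega < 1:
        return omega, "purifying selection"
    if omega > 1:
        return omega, "positive selection"
    return omega, "neutral evolution"


# Hypothetical rates for a single gene (not values from the study).
omega, label = selection_regime(ka=0.05, ks=0.50)
print(f"Ka/Ks = {omega:.2f}: {label}")
```

A gene with a higher Ka/Ks ratio that is still below 1 is evolving faster, but it remains constrained by purifying selection.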

Control region
The control region of M. indica is located between rrnS and trnI (GAT) and is 1,046 bp in length. The nucleotide composition of the control region is T (40.57%) > A (37.24%) > C (13.88%) > G (8.31%), with AT (77.81%) > GC (22.19%), showing a negative AT-skew and a negative GC-skew. We observed eight tandem repeat sequences in the control region, ranging from 18 to 149 bp in length (Table 4).

Phylogenetic relationships
Before reconstructing the phylogenetic tree, we performed saturation and heterogeneity analyses on the two datasets (PCGs and PRT). The saturation analysis showed that the sequences of the two datasets are not saturated (Iss < Iss.c, p < 0.05) (Fig 10). Heterogeneity analysis revealed that the sequences exhibited low compositional heterogeneity (Fig 11). Both analyses indicated that these datasets were suitable for phylogenetic studies.

Divergence time estimation
We evaluated the divergence time of the Pentatomidae based on the PCGs dataset (Fig 13). The results revealed that the divergence time of the Pentatomidae was 122.75 Mya (95% HPD: 98.76–145.43 Mya), which was in the Aptian stage of the Early Cretaceous period.

Discussion and conclusions
In this study, we sequenced the complete mitochondrial genome of M. indica using second-generation sequencing technology. The arrangement of the 37 genes was consistent with that of published Pentatomidae species [52,53,65], indicating that no gene rearrangements have occurred. The nucleotide composition of the mitochondrial genome of M. indica exhibits high AT content, and such base composition heterogeneity is common in Heteroptera species [49].
Codon usage bias reflects how species gradually adapt to their environments during evolution. Analyzing codon usage can aid studies of the evolution and environmental adaptability of different species. In the M. indica mitochondrial genome, we observed a significant AT bias in the nucleotide composition and a preference for codons ending in A/T. The Ka/Ks ratios of the Pentatomidae PCGs were less than one, indicating that they have been subjected to purifying selection. The evolutionary rate of atp8 was the fastest, whereas that of cox1 was the slowest, consistent with previous studies [58,66]. These results indicate that M. indica evolution may have been influenced by natural selection. Except for trnS1, the remaining 21 tRNA genes in M. indica had the typical cloverleaf secondary structure common to many insect species. Some atypical base pairings, such as G-U pairs, were observed in the 22 tRNA genes and two rRNA genes of M. indica. RNAs carrying such non-Watson-Crick pairings can be restored to full function via post-transcriptional mechanisms [67,68]. The structures of the tRNA genes are more conserved in Pentatomidae than those of the rRNA genes.
The phylogenetic trees we constructed were similar to those based on traditional morphological studies (Rider et al. 2018) [2]. M. indica is closely related to A. musgravei, and N. typica was the first species to diverge within Pentatomidae. The same results were obtained by Lian et al. (2022) [66] and Ding et al. (2023) [65]. Our results rejected the monophyly of Pentatominae and Podopinae and supported Asopinae as monophyletic. Our results agreed with those of Lian et al. (2022) [66] in supporting the monophyly of Phyllocephalinae. The monophyly of Eysarcorini and Strachiini is supported in many studies [24,53,65]. Halyini and Caystrini are closely related, forming a stable sister-group relationship. In a study by Li et al. (2021) [53], Nezarini and Antestiini were clustered on the same branch, which differs from the results of this study; the relationship between Nezarini and Antestiini therefore remains unclear. In addition, the classification status of Pentatomini, Antestiini, and Nezarini was unstable, and more attention should be paid to both the morphology and the molecular data of these tribes. More taxa are therefore required to better explain the phylogenetic relationships of the Pentatomidae. The molecular clock method was used to estimate the origin and divergence times of the species and to further explore the evolutionary history of Pentatomidae. Pentatomidae originated in the Cretaceous period of the Mesozoic era, whereas M. indica originated in the Paleogene period of the Cenozoic era. In addition, during the evolutionary history of Pentatomidae, a distinct group of predatory bugs arose, and their feeding habits underwent corresponding changes that may be related to environmental changes. This evolutionary history requires further research.
This study is the first to sequence the M. indica mitochondrial genome, and it provides a theoretical basis for the phylogenetic relationships and evolutionary history of Pentatomidae. Because relatively few mitochondrial genomes are available for Pentatomidae, research on the phylogenetic relationships within the family is limited and cannot yet resolve taxonomic positions reliably. Further research is therefore needed to increase the number of mitochondrial genomes of Pentatomidae species and to further elucidate the phylogenetic relationships among Pentatomidae by combining morphological and biological characteristics.